Injective Hilbert Space Embeddings of Probability Measures
Abstract
A Hilbert space embedding for probability measures has recently been proposed, with applications including dimensionality reduction, homogeneity testing and independence testing. This embedding represents any probability measure as a mean element in a reproducing kernel Hilbert space (RKHS). The embedding function has been proven to be injective when the reproducing kernel is universal. In this case, the embedding induces a metric on the space of probability distributions defined on compact metric spaces. In the present work, we consider more broadly the problem of specifying characteristic kernels, defined as kernels for which the RKHS embedding of probability measures is injective. In particular, characteristic kernels can include non-universal kernels. We restrict ourselves to translation-invariant kernels on Euclidean space, and define the associated metric on probability measures in terms of the Fourier spectrum of the kernel and characteristic functions of these measures. The support of the kernel spectrum is important in finding whether a kernel is characteristic: in particular, the embedding is injective if and only if the kernel spectrum has the entire domain as its support. Characteristic kernels may nonetheless have difficulty in distinguishing certain distributions on the basis of finite samples, again due to the interaction of the kernel spectrum and the characteristic functions of the measures.
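The distance between two mean elements can be estimated from samples. The following is a minimal sketch, not the authors' implementation: it assumes a Gaussian kernel (translation-invariant, with a Fourier spectrum supported on the whole domain, so it satisfies the injectivity condition stated in the abstract) and uses the biased plug-in estimate of the squared RKHS distance. The function names, the bandwidth `sigma`, and the sample sizes are illustrative assumptions.

```python
import math
import random


def gaussian_kernel(x, y, sigma=1.0):
    """Gaussian kernel k(x, y) = exp(-||x - y||^2 / (2 sigma^2)) on R^d."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def mean_kernel(xs, ys, sigma):
    """Average of k(x, y) over all pairs: the inner product of two empirical mean elements."""
    total = sum(gaussian_kernel(x, y, sigma) for x in xs for y in ys)
    return total / (len(xs) * len(ys))


def mmd_squared(xs, ys, sigma=1.0):
    """Biased estimate of the squared RKHS distance between the mean elements of two samples."""
    return (mean_kernel(xs, xs, sigma)
            + mean_kernel(ys, ys, sigma)
            - 2.0 * mean_kernel(xs, ys, sigma))


def sample_gaussian(n, mean, rng):
    """Draw n one-dimensional points from N(mean, 1), each stored as a 1-tuple."""
    return [(rng.gauss(mean, 1.0),) for _ in range(n)]


rng = random.Random(0)  # fixed seed so the illustration is repeatable
same = mmd_squared(sample_gaussian(200, 0.0, rng), sample_gaussian(200, 0.0, rng))
shifted = mmd_squared(sample_gaussian(200, 0.0, rng), sample_gaussian(200, 2.0, rng))
print(f"same distribution:    {same:.4f}")
print(f"shifted distribution: {shifted:.4f}")
```

With samples from the same distribution the estimate should be close to zero, and it should be clearly larger when one sample is shifted. How quickly the two cases separate depends on the bandwidth and the sample size, which is the finite-sample difficulty the abstract mentions.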
Similar resources
Characteristic and Universal Tensor Product Kernels
Kernel mean embeddings provide a versatile and powerful nonparametric representation of probability distributions with several fundamental applications in machine learning. Key to the success of the technique is whether the embedding is injective. This characteristic property of the underlying kernel ensures that probability distributions can be discriminated via their representations. In this ...
Hilbert Space Embeddings and Metrics on Probability Measures
A Hilbert space embedding for probability measures has recently been proposed, with applications including dimensionality reduction, homogeneity testing, and independence testing. This embedding represents any probability measure as a mean element in a reproducing kernel Hilbert space (RKHS). A pseudometric on the space of probability measures can be defined as the distance between distribution...
Hilbert Space Embeddings in Dynamical Systems
In this paper we study Hilbert space embeddings of dynamical systems and embeddings generated via dynamical systems. This is achieved by following the behavioural framework invented by Willems, namely by comparing trajectories of states. As important special cases we recover the diffusion kernels of Kondor and Lafferty, generalised versions of directed graph kernels of Gärtner, novel kernels on...
On the Hilbert Space Embeddings and Metrics on Probability Measures
A Hilbert space embedding for probability measures has recently been proposed (Gretton et al., 2007; Smola et al., 2007), with applications including dimensionality reduction, homogeneity testing and independence testing. This embedding represents any probability measure as a mean element in a reproducing kernel Hilbert space (RKHS). Using this embedding, a pseudometric (let us define it as γk)...
Hypothesis testing using pairwise distances and associated kernels
We provide a unifying framework linking two classes of statistics used in two-sample and independence testing: on the one hand, the energy distances and distance covariances from the statistics literature; on the other, distances between embeddings of distributions to reproducing kernel Hilbert spaces (RKHS), as established in machine learning. The equivalence holds when energy distances are co...